October 24, 2024
Disclaimer: This presentation reflects my views and should not be construed to represent FDA’s views or policies.
Rare diseases pose challenges for conducting clinical trials, primarily due to the small patient population size. Thus, it may be difficult to recruit enough participants for adequately powered studies.
Traditional randomized controlled trials (RCTs) may be infeasible or impractical in rare disease research.
Small datasets can lead to imprecise parameter estimates and limit the power of statistical analyses.
We are starting to see more examples of using generative models to create clinical data.
Generative models for imaging data –
See Sizikova & CDRH colleagues (2024)1 (GANs, diffusion models, deconvolutional models, VAEs, etc.)
AnimalGAN from NCTR (Chen et al. 20232) –
generation of synthetic clinical pathology measurements to assess toxicology of untested chemicals on animals
Digital Twins –
Unlearn.AI’s PROCOVA (prognostic covariate adjustment model)3
Synthetic control (Pennello & Thompson, 2008)4
Priors constructed using in silico models:
Prior distributions on latent weight parameters of DL models:
Challenge 2: Uncertainty can be incorporated into the data generation process by placing prior distributions on model parameters (leading to posterior distributions on those parameters).
Can generate diverse instances by sampling different parent parameters from the posterior distribution, then sampling instances given the parameters.
Traditional synthetic data generation methods may rely on point estimates of parameters and thus not fully capture the underlying data distribution.
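The contrast between point-estimate and posterior-based generation can be sketched with a toy conjugate-normal example (my illustration, not from the talk; all numbers are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed "real" data: n draws from N(mu_true, 1)
n, mu_true = 50, 2.0
x = rng.normal(mu_true, 1.0, size=n)

# Conjugate posterior for the mean (known variance 1, prior N(0, 10^2))
prior_var, lik_var = 100.0, 1.0
post_var = 1.0 / (1.0 / prior_var + n / lik_var)
post_mean = post_var * (x.sum() / lik_var)

# Point-estimate generation: plug one parameter value into every dataset
synthetic_plugin = rng.normal(post_mean, np.sqrt(lik_var), size=(10, n))

# Bayesian generation: resample the parameter for each synthetic dataset,
# so parameter uncertainty propagates into between-dataset variation
mus = rng.normal(post_mean, np.sqrt(post_var), size=10)
synthetic_bayes = np.stack([rng.normal(m, np.sqrt(lik_var), size=n) for m in mus])
```

The Bayesian datasets exhibit extra between-dataset spread in their sample means, which is exactly the parameter uncertainty that plug-in generation discards.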
Bayesian hierarchical models (BHMs) are natural frameworks for combining data across sources
Proof-of-concept Exercise*:
Generate synthetic data and combine it with “real” (simulated) data
*I do not necessarily think this is ideal
Example similar to the multi-modal synthetic data generation example from the original Bayesian GAN (BGAN) paper (Saatchi & Wilson, 2017).
\[\underset{5000 \times 30}{\mathbf{X}_1} \sim N(\mathbf{\mu}_1, \mathbf{\Sigma}) \hspace{2mm} \underset{5000 \times 30}{\mathbf{X}_2} \sim N(\mathbf{\mu}_2, \mathbf{\Sigma})\] \[\mathbf{\Sigma}: \hspace{2mm} \sigma_{ii} = 1, \hspace{1mm} \sigma_{ij} = 0.2 \] The mean vectors were either all 1s or all -1s. \[\mathbf{\mu}_1 = [1,...,1] \hspace{5mm} \mathbf{\mu}_2 = [-1,...,-1]\] I added 8 pairwise interactions to the 30 covariates:
\[\hspace{5mm} \underset{5000 \times 38}{\mathbf{X}_j} \leftarrow \underset{5000 \times 30}{\mathbf{X}_j} + \text{8 interactions}\]
I simulated a binary “response” vector using the \(\mathbf{X}\)s and a coefficient vector \(\mathbf{\beta}\) drawn from a MVN distribution with correlations of 0.2.
\[\underset{5000 \times 1}{\mathbf{y}_1} \sim Bern(p = {(1+\operatorname{exp}( \mathbf{X}_1\beta)})^{-1})\]
\[\underset{5000 \times 1}{\mathbf{y}_2} \sim Bern(p = {(1+\operatorname{exp}( \mathbf{X}_2\beta)})^{-1})\]
\[\underset{38 \times 1}{\mathbf{\beta}} \sim N(\mathbf{0},\Sigma) \hspace{3mm} \sigma_{ii} = 1, \hspace{1mm} \sigma_{ij} = 0.2\] The data were divided into training and validation sets with an 80/20 split.
\[ \underset{(80/20)}{\text{Training/valid set: }} \begin{bmatrix} \mathbf{X}_1 & \vdots & \mathbf{y}_1 \\ \hline \mathbf{X}_2 & \vdots & \mathbf{y}_2 \end{bmatrix} \]
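The simulation above can be reproduced with a short NumPy sketch. The 8 interaction pairs were not specified on the slides, so the pairs below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, rho = 5000, 30, 0.2

# Equicorrelated covariance: 1 on the diagonal, 0.2 off-diagonal
Sigma = np.full((p, p), rho) + (1 - rho) * np.eye(p)

# Two groups with mean vectors of all +1s and all -1s
X1 = rng.multivariate_normal(np.ones(p), Sigma, size=n)
X2 = rng.multivariate_normal(-np.ones(p), Sigma, size=n)

# Append 8 pairwise interactions (hypothetical pairs, for illustration)
pairs = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13), (14, 15)]
def add_interactions(X):
    return np.hstack([X] + [(X[:, i] * X[:, j])[:, None] for i, j in pairs])
X1, X2 = add_interactions(X1), add_interactions(X2)  # now n x 38

# Correlated coefficient vector, then binary responses with the slides'
# sign convention p = (1 + exp(X beta))^{-1}
d = X1.shape[1]
Sigma_b = np.full((d, d), rho) + (1 - rho) * np.eye(d)
beta = rng.multivariate_normal(np.zeros(d), Sigma_b)
y1 = rng.binomial(1, 1.0 / (1.0 + np.exp(X1 @ beta)))
y2 = rng.binomial(1, 1.0 / (1.0 + np.exp(X2 @ beta)))

# Stack the two groups and take an 80/20 training/validation split
X = np.vstack([X1, X2]); y = np.concatenate([y1, y2])
idx = rng.permutation(len(y)); cut = int(0.8 * len(y))
X_tr, y_tr = X[idx[:cut]], y[idx[:cut]]
X_va, y_va = X[idx[cut:]], y[idx[cut:]]
```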
Generator/discriminator networks were very similar to those used in the paper:
Compare learned representations across synthetic and real datasets.
Comparison of first 2 PCs of each of 10 synthetic datasets with validation dataset
Compare histograms of validation data and (all) generated data across all 39 variables
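A qualitative PC comparison like the one described above can be done with a simple SVD-based PCA. This is a sketch with stand-in random arrays; the real comparison would use the validation data and the BayesGAN draws:

```python
import numpy as np

def pc_basis(X):
    """First two principal directions of X (rows of Vt from the SVD)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:2]

def project(X, mean, components):
    """Project data onto a fixed 2-D principal-component basis."""
    return (X - mean) @ components.T

# Stand-in data: validation set and 10 synthetic sets with 39 columns each
rng = np.random.default_rng(1)
X_valid = rng.normal(size=(200, 39))
synthetic_sets = [rng.normal(size=(200, 39)) for _ in range(10)]

# Use one common basis (fit on the validation data) for every projection,
# so the scatterplots are directly comparable
mean, comps = pc_basis(X_valid)
valid_pcs = project(X_valid, mean, comps)
synth_pcs = [project(S, mean, comps) for S in synthetic_sets]
# Overlaying scatterplots of valid_pcs against each entry of synth_pcs
# (e.g., with matplotlib) reproduces the qualitative comparison.
```

Projecting everything into a single basis is the key design choice; fitting a separate PCA per dataset would make the axes incomparable.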
Each of the 10 synthetic datasets generated by the BayesGAN using different generator weight samples will be treated as a hypothetical “prior” study.
Suppose we obtain a new dataset (the test set, here) that we want to combine with the prior studies. (Qualitative comparisons with the synthetic datasets were similar to those on the previous slide.)
In the BHM, there are 11 studies. The study-specific parameters will form level 2 of the model, and will borrow information from each other.
However, the data model used for observations in each study was different from that used in the generative model.
Bayesian hierarchical logistic regression model, where the synthetic datasets (\(j=1,...,10\)) serve as prior studies for the “real” study (\(j=test\)).
\[y_{ij} \sim Bern(p_{ij}) \hspace{5mm} i=1,...,n_j = 100; \hspace{3mm} j=1,...,10,\text{test}\] \[p_{ij} = (1+\operatorname{exp}(\alpha_j +\mathbf{x}_{ij}^{T}\mathbf{\beta}_j))^{-1} \]
\[\mathbf{\beta}_j \sim N(\mathbf{0},\mathbf{\Sigma}_{\beta}), \hspace{3mm} \alpha_j \sim N(0, 10) \hspace{3mm} j=1,...,10,\text{test}\] LKJ prior on the correlation matrix (then transform back to covariance matrix \(\mathbf{\Sigma_{\beta}}\))
\[\operatorname{Cor}(\mathbf{\beta}) = \mathbf{R}_{\mathbf{\beta}} = L_\Omega L_\Omega^T \sim LKJ\_Corr(\eta = 0.8) \propto \operatorname{det}(\mathbf{R}_{\mathbf{\beta}})^{\eta - 1}\]
We want to make inference on \(\mathbf{\beta}_{test}\).
Bayesian versions of synthetic data generation are potentially fruitful topics for future research.
BHMs are natural structures for combining information across synthetic datasets and real data. But, more flexible versions may be needed.
Many newer generative models have natural levels of hierarchy that might be used (e.g., self-attention heads in multi-head attention layers of transformers, generator networks in GANs).
NumPyro (a NumPy backend for Pyro built on JAX) can be useful for fitting hierarchical models with large amounts of simulated data.
Sizikova et al. (2024) Synthetic data in radiological imaging: current state and future outlook, BJR|Artificial Intelligence, 1(1)
Chen et al. (2023) AnimalGAN: A Generative Adversarial Network Model Alternative to Animal Studies for Clinical Pathology Assessment
Walsh et al. (2021) Using digital twins to reduce sample sizes while maintaining power and statistical accuracy, Alzheimer’s Dement. 2021;17(Suppl. 9):e054657
Pennello & Thompson (2008) Experience with reviewing Bayesian Medical Device Trials, Journal of Biopharmaceutical Statistics, 18:1, 81 - 115
Haddad et al. (2017) Incorporation of stochastic engineering models as prior information in Bayesian medical device trials, Journal of Biopharmaceutical Statistics, DOI: 10.1080/10543406.2017.1300907
Kiagias et al. (2021) Bayesian Augmented Clinical Trials in TB Therapeutic Vaccination, Front. Med. Technol. 3:719380.
Saatchi and Wilson (2017) Bayesian GAN. 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA
Chien & Kuo (2019) Variational Bayesian GAN, 2019 27th European Signal Processing Conference (EUSIPCO)
Alt et al. (2024) LEAP: the latent exchangeability prior for borrowing information from historical data, Biometrics, 80(3).
Reimherr et al. (2021) Prior sample size extensions for assessing prior impact and prior-likelihood discordance, J R Stat Soc Series B, 83:413-437
Comments on application of BHM to the example
The synthetic datasets from the BayesGAN may have been too diverse relative to the test dataset, resulting in minimal borrowing across coefficient vectors.
A model with clusters of exchangeability may be more appropriate if some synthetic datasets are more similar to the current study than others. A Dirichlet process mixture model or LEAP model (Alt et al., 20249) could flexibly model alternatives to full exchangeability.
One could quantify how much was borrowed from the synthetic datasets using the prior effective sample size (PESS), which represents the amount of information contributed by the prior (here, the synthetic datasets).
There are several proposals for computing PESS, and approximation may be necessary for more complicated models; Reimherr et al. (2021)10 provide an approximation for a "multivariate" PESS.